Retrieval-Augmented Generation (RAG)
What is RAG
RAG is a technique in natural language processing (NLP) that combines retrieval-based and generation-based methods.
- "retrieval-based" means that, unlike purely generative models that create responses from scratch, retrieval-based systems access existing data or knowledge and use it to formulate their outputs
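The retrieve-then-generate idea can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the `DOCS` list, the word-overlap scoring, and the helper names `tokenize`, `retrieve`, and `build_prompt` are all assumptions made for the example. The generation step is left as a prompt string that would be passed to whatever LLM is in use.

```python
import re

# Hypothetical toy knowledge base; a real system would draw on documents, wikis, etc.
DOCS = [
    "RAG combines a retriever with a text generator.",
    "Fine-tuning updates model weights on new data.",
    "The Eiffel Tower is located in Paris.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, docs, k=1):
    """Retrieval step: rank docs by how many words they share with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Augmentation step: place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

question = "Where is the Eiffel Tower?"
passages = retrieve(question, DOCS)
prompt = build_prompt(question, passages)  # this string would be sent to the LLM
print(prompt)
```

Word overlap stands in for a real retriever here only to keep the sketch dependency-free; the overall shape (retrieve, build an augmented prompt, generate) is the part that carries over.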
Why RAG is needed
- LLMs face several challenges:
  - limited domain knowledge
  - hallucinations
  - training cutoff date (outdated training set)
- fine-tuning the LLM can address some of these challenges, but training an LLM is expensive
Data for RAG
RAG integrates with many types of data sources:
- documents
- wikis
- expert systems
- web pages
- databases
- vector stores (which hold vector representations of text)
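To make the last item concrete, here is a rough sketch of what a vector store does: it keeps a vector for each piece of text and returns the texts whose vectors are closest to the query vector. Everything in it is an assumption for illustration. `embed` uses plain word counts in place of a learned embedding model, and `TinyVectorStore` is a hypothetical class, not the API of any real product.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Hypothetical in-memory store holding (text, vector) pairs."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        scored = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in scored[:k]]

store = TinyVectorStore()
store.add("Vector stores index text as numeric vectors.")
store.add("Wikis are collaboratively edited web pages.")
results = store.search("how is text stored as vectors")
print(results)
```

A production vector store would typically use dense embeddings and an approximate nearest-neighbour index rather than a linear scan, but the add-then-search interface is similar in spirit.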
Data preparation for RAG
- data must fit inside the model's context window
- data must be in a format that allows its relevance to be assessed at inference time
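One common way to meet the first requirement is to split long texts into overlapping chunks and then pass only as many chunks as the context window allows. The sketch below assumes word counts as a rough proxy for tokens; `chunk_words`, `pack_context`, and their parameters are hypothetical names chosen for this example, and the chunks given to `pack_context` are assumed to be already ranked by relevance.

```python
def chunk_words(text, max_words=50, overlap=10):
    """Split text into chunks of at most max_words words, with overlapping edges."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

def pack_context(ranked_chunks, budget_words):
    """Take chunks in ranked order until the word budget would be exceeded."""
    picked, used = [], 0
    for chunk in ranked_chunks:
        size = len(chunk.split())
        if used + size > budget_words:
            break
        picked.append(chunk)
        used += size
    return picked

text = " ".join(f"w{i}" for i in range(12))  # toy 12-word "document"
chunks = chunk_words(text, max_words=5, overlap=1)
context = pack_context(chunks, budget_words=10)
print(chunks)
print(context)
```

The overlap keeps a sentence that straddles a chunk boundary from being lost entirely. A real pipeline would count tokens with the model's own tokenizer instead of splitting on whitespace.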